Abstract

Background: Through social interactions, individuals affect one another's phenotype. In such cases, an individual's 
phenotype is affected by the direct (genetic) effect of the individual itself and the indirect (genetic) effects of the 
group mates. Using data on individual phenotypes, direct and indirect genetic (co)variances can be estimated. 
Together, they compose the total genetic variance that determines a population's potential to respond to selection. 
However, it can be difficult or expensive to obtain individual phenotypes. Phenotypes on traits such as egg 
production and feed intake are, therefore, often collected on group level. In this study, we investigated whether 
direct, indirect and total genetic variances, and breeding values can be estimated from pooled data (pooled by 
group). In addition, we determined the optimal group composition, i.e. the optimal number of families represented 
in a group to minimise the standard error of the estimates. 

Methods: This study was performed in three steps. First, all research questions were answered by theoretical 
derivations. Second, a simulation study was conducted to investigate the estimation of variance components and 
optimal group composition. Third, individual and pooled survival records on 12 944 purebred laying hens were 
analysed to investigate the estimation of breeding values and response to selection. 

Results: Through theoretical derivations and simulations, we showed that the total genetic variance can be 
estimated from pooled data, but the underlying direct and indirect genetic (co)variances cannot. Moreover, we 
showed that the most accurate estimates are obtained when group members belong to the same family. Additional 
theoretical derivations and data analyses on survival records showed that the total genetic variance and breeding 
values can be estimated from pooled data. Moreover, the correlation between the estimated total breeding values 
obtained from individual and pooled data was surprisingly close to one. This indicates that, for survival in purebred 
laying hens, loss in response to selection will be small when using pooled instead of individual data. 

Conclusions: Using pooled data, the total genetic variance and breeding values can be estimated, but the underlying 
genetic components cannot. The most accurate estimates are obtained when group members belong to the same 
family. 



Background 

Group housing is common practice in most livestock 
farming systems. Previous studies have shown that 
group-housed animals can substantially affect one an- 
other's phenotype through social interactions [1-9]. The 
heritable effect of an individual on its own phenotype is
known as the direct genetic effect, while the heritable effect
of an individual on the phenotype of a group mate
is known as the social, associative or indirect genetic effect
[10-14]. Both direct and indirect genetic effects determine
a population's potential to respond to selection,
i.e. the total genetic variance [2,10-14]. Selection experiments
in laying hens and quail [1,2,9], and variance
component estimates in laying hens, quail, beef cattle 
and pigs [3-9] have shown that indirect genetic effects 
can contribute substantially to the total genetic variation 
in agricultural populations. 

Direct, indirect and total genetic variances can be estimated
from individual data. However, it can be difficult
or expensive to obtain individual phenotypes on certain
traits, e.g. egg production and feed intake. Alternatively,
data can be obtained on group level, resulting in pooled 
records. However, pooling data reduces the number of 
data points. Moreover, multiple animals influence each 
data point, increasing the complexity of the data. Al- 
though there is an obvious loss of power, previous stud- 
ies have shown that pooled data can be used to estimate 
direct genetic variances for traits not affected by social 
interactions [15-17]. However, with social interactions, 
indirect genetic effects emerge and the complexity of the 
data increases further. It is unclear whether pooled data 
are still informative in these situations. Therefore, the 
main objective of this study was to determine whether 
pooled data can be used to estimate direct, indirect and 
total genetic variances, and breeding values for traits af- 
fected by social interactions. In addition, optimal group 
composition was determined, i.e. the optimal number of 
families represented in a group to minimise the standard 
error of the estimates. 

Methods 

This study was performed in three steps. First, all re- 
search questions were answered by theoretical deriva- 
tions. Second, a simulation study was conducted to 
investigate the estimation of variance components and 
optimal group composition. Third, individual and pooled 
survival records on 12 944 purebred laying hens were 
analysed to investigate the estimation of breeding values 
and response to selection. 
Table 1 lists the main symbols and their meaning. 

Theory 

Variance components and breeding value estimation 

In this section, we examined whether direct, indirect 
and total genetic variances, and breeding values can be 
estimated from pooled data. 

With social interactions, an individual phenotype consists
of the direct genetic ($A_D$) and environmental ($E_D$)
effects of the individual itself ($i$), and the indirect genetic
($A_I$) and environmental ($E_I$) effects of its group mates ($j$):

$$P_i = A_{D,i} + E_{D,i} + \sum_{j \neq i}^{n} A_{I,j} + \sum_{j \neq i}^{n} E_{I,j}, \qquad (1)$$

where $n$ is the number of individuals per group [11].
From an animal breeding perspective, the total breeding
value ($A_T$) is of interest because it determines total
response to selection. An animal's $A_T$ consists of a direct
and indirect component:

$$A_{T,i} = A_{D,i} + (n-1) A_{I,i}, \qquad (2)$$

where $A_D$ is expressed in the phenotype of the animal itself
and $A_I$ is expressed in the phenotype of each group mate.

Table 1 Notation key

| Symbol | Meaning |
|---|---|
| $i$ - $j$ | Focal individual - Group mates of the focal individual |
| $A_D$ | Direct genetic effect / Direct breeding value |
| $A_I$ | Indirect genetic effect / Indirect breeding value |
| $A_T$ | Total genetic effect / Total breeding value |
| $E_D$ | Direct environmental effect |
| $E_I$ | Indirect environmental effect |
| $\sigma^2_{A_D}$ | Direct genetic variance |
| $\sigma_{A_{DI}}$ | Direct-indirect genetic covariance |
| $\sigma^2_{A_I}$ | Indirect genetic variance |
| $\sigma^2_{A_T}$ | Total genetic variance |
| $\sigma^2_{Cage}$ | Cage variance |
| $\sigma^2_{E}$ | Error variance |
| $\sigma^2_{P}$ | Phenotypic variance |
| $\sigma^2_{E^*}$ | Pooled error variance |
| $\sigma^2_{P^*}$ | Pooled phenotypic variance |
| $h^2$ | Direct genetic variance relative to phenotypic variance / Heritability |
| $T^2$ | Total genetic variance relative to phenotypic variance |
| $\sigma^2_f$ | Full variance |
| $\sigma^2_b$ | Between-family variance |
| $\sigma^2_w$ | Within-family variance |
| $r$ | Relatedness within a family |
| $N$ | Number of families |
| $m$ | Number of records per family |
| $o$ | Family size |
| $n$ | Group size |
| $\hat{\ }$ | Hat, denotes estimated values |



A pooled record ($P^*$) consists of the individual phenotypes
of all group members ($k$):

$$P^* = \sum_{k=1}^{n} P_k. \qquad (3)$$

It follows from Equations (1) and (3) that, with social
interactions, a pooled record consists of the $A_D$ and $E_D$
of each group member, as well as their $A_I$ and $E_I$ that are
expressed $n - 1$ times:

$$P^* = \sum_{k=1}^{n} \left[ A_{D,k} + E_{D,k} + (n-1)\left(A_{I,k} + E_{I,k}\right) \right]. \qquad (4)$$
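The step from Equations (1) and (3) to Equation (4) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the randomly drawn effects are illustrative assumptions, and only the two summation rules come from the text.

```python
import random


def individual_phenotypes(a_d, a_i, e_d, e_i):
    """Equation (1): own direct effects plus the indirect effects of all group mates."""
    n = len(a_d)
    return [
        a_d[i] + e_d[i] + sum(a_i[j] + e_i[j] for j in range(n) if j != i)
        for i in range(n)
    ]


def pooled_from_effects(a_d, a_i, e_d, e_i):
    """Equation (4): each member's indirect effects are expressed n - 1 times."""
    n = len(a_d)
    return sum(a_d[k] + e_d[k] + (n - 1) * (a_i[k] + e_i[k]) for k in range(n))


# Hypothetical group of four animals with arbitrary effects (illustration only)
rng = random.Random(1)
n = 4
a_d = [rng.gauss(0, 1) for _ in range(n)]
a_i = [rng.gauss(0, 1) for _ in range(n)]
e_d = [rng.gauss(0, 1) for _ in range(n)]
e_i = [rng.gauss(0, 1) for _ in range(n)]

pooled_eq3 = sum(individual_phenotypes(a_d, a_i, e_d, e_i))  # Equation (3)
pooled_eq4 = pooled_from_effects(a_d, a_i, e_d, e_i)         # Equation (4)
print(abs(pooled_eq3 - pooled_eq4) < 1e-9)
```

Both routes give the same pooled record because every indirect effect appears once in the phenotype of each of the $n - 1$ group mates.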



Because an animal's $A_D$ and $A_I$ are expressed in the
same pooled record, the direct Z-matrix that links
pooled phenotypes to $A_D$'s and the indirect Z-matrix
that links pooled phenotypes to $A_I$'s are completely
confounded (as shown in Appendix A by using a fictitious
example (Table 8)). Consequently, direct and indirect
(co)variances, and breeding values cannot be estimated
from pooled data.

It follows from Equations (2) and (4) that, with social 
interactions, a pooled record contains the total genetic 
effect of each group member: 

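$$P^* = \sum_{k=1}^{n} \left(A_{T,k} + E_k\right), \qquad (5)$$

where $E_k = E_{D,k} + (n-1)E_{I,k}$ collects the direct and
indirect environmental effects of group member $k$.
Equation (5) shows strong similarities with:

$$P^* = \sum_{k=1}^{n} \left(A_{D,k} + E_{D,k}\right), \qquad (6)$$

which shows the content of a pooled record when social
interactions do not occur. Previous studies have shown
that pooled data can be used to estimate direct genetic
variances ($\sigma^2_{A_D}$) and direct breeding values for traits that
are not affected by social interactions [15-17]. Similarly,
pooled data can be used to estimate total genetic variances
($\sigma^2_{A_T}$) and total breeding values for traits that are
affected by social interactions.

Table 2 Within-family variance ($\sigma^2_w$) and number of
records per family ($m$) for three group compositions

| Group composition | $\sigma^2_w$ | $m$ |
|---|---|---|
| 1 family | $\frac{1}{n}\left[\sigma^2_{P_D} + 2(n-1)\sigma_{P_{DI}} + (n-1)^2\sigma^2_{P_I} + (n-1)\,r\,\sigma^2_{A_T}\right] - r\,\sigma^2_{A_T}$ | $o/n$ |
| 2 families | $\frac{4}{n}\left[\sigma^2_{P_D} + 2(n-1)\sigma_{P_{DI}} + (n-1)^2\sigma^2_{P_I} + \left(\frac{n}{2}-1\right) r\,\sigma^2_{A_T}\right] - r\,\sigma^2_{A_T}$ | $2o/n$ |
| $n$ families | $n\left[\sigma^2_{P_D} + 2(n-1)\sigma_{P_{DI}} + (n-1)^2\sigma^2_{P_I}\right] - r\,\sigma^2_{A_T}$ | $o$ |

$r$, $N$, $n$, $o$ and $\sigma^2_b$ do not differ between group compositions.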

Optimal group composition 

In this section, the standard error (s.e.) of $\hat{\sigma}^2_{A_T}$ is derived
for three experimental designs that differ with respect to
group composition, i.e. group members belonged to either
one, two or $n$ families. The s.e. of an estimate of the
genetic variance depends on the between- ($\sigma^2_b$) and
within-family variance ($\sigma^2_w$), the relatedness within a
family ($r$), the number of families ($N$), and the number
of records per family ($m$) [18]:

$$\text{s.e.}\left(\hat{\sigma}^2_A\right) = \frac{1}{r}\sqrt{\frac{2}{N-1}\left[\sigma^4_b + \frac{2\sigma^2_b\sigma^2_w}{m} + \frac{\sigma^4_w}{m(m-1)}\right]}. \qquad (7)$$

Analysis of variance was used to derive $\sigma^2_b$ and $\sigma^2_w$ for
each design (see Appendix B for derivation).

The s.e. of $\hat{\sigma}^2_{A_T}$ differs between experimental designs
because the group composition changes the within-family
variance and the number of records per family
(Table 2). On the one hand, the within-family variance
decreases when the number of families per group decreases,
causing a strong decrease in s.e. On the other
hand, the number of records per family decreases when
the number of families per group decreases, causing a
slight increase in s.e. Overall, to obtain the most accurate
estimate of $\sigma^2_{A_T}$, group members should belong to
the same family. The only exception is when family size
($o$) equals group size ($n$). In this case, there is only one
record per family and $\sigma^2_{A_T}$ would not be estimable.

Ideally, group members should be full sibs rather than
half sibs, since an increase in relatedness causes a decrease
in the s.e. of $\hat{\sigma}^2_{A_T}$.
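The comparison between group compositions can be made concrete by evaluating Equation (7) with the within-family variances and record numbers of Table 2. The sketch below is illustrative only: the function and argument names are assumptions, the input values are those of the simulated reference design described later (500 full-sib families of 12, groups of four, $\sigma^2_{A_T}$ = 10), and the sum $\sigma^2_{P_D} + 2(n-1)\sigma_{P_{DI}} + (n-1)^2\sigma^2_{P_I}$ is taken to be 30 for that design.

```python
import math


def se_genetic_variance(var_b, var_w, r, n_families, m):
    """Equation (7): s.e. of a genetic variance estimated from family records."""
    bracket = var_b ** 2 + 2 * var_b * var_w / m + var_w ** 2 / (m * (m - 1))
    return (1.0 / r) * math.sqrt(2.0 / (n_families - 1) * bracket)


def design_components(families_per_group, n, o, r, var_at, s):
    """Return (var_b, var_w, m) following Table 2.

    s stands for sigma2_PD + 2(n-1) sigma_PDI + (n-1)^2 sigma2_PI.
    """
    var_b = r * var_at
    if families_per_group == 1:
        var_f = (s + (n - 1) * r * var_at) / n
        m = o / n
    elif families_per_group == 2:
        var_f = 4.0 / n * (s + (n / 2 - 1) * r * var_at)
        m = 2 * o / n
    elif families_per_group == n:
        var_f = n * s
        m = o
    else:
        raise ValueError("only 1, 2 or n families per group are covered")
    return var_b, var_f - var_b, m


# Assumed inputs: n = 4, o = 12, full sibs (r = 0.5), N = 500, sigma2_AT = 10, s = 30
predicted = {}
for families in (4, 2, 1):
    var_b, var_w, m = design_components(families, n=4, o=12, r=0.5, var_at=10.0, s=30.0)
    predicted[families] = se_genetic_variance(var_b, var_w, r=0.5, n_families=500, m=m)
    print(families, round(predicted[families], 2))
# 4 families: 1.88, 2 families: 1.3, 1 family: 0.92
```

Under these assumptions the predicted s.e. roughly halves when moving from four families per group to one.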



Simulation 

To validate the theoretical derivations, a simulation study
was conducted in R v2.12.2 [19]. A base population of 500
sires and 500 dams was simulated. Each animal in the base
population was assigned a direct and indirect breeding
value, drawn from

$$N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma^2_{A_D} & \sigma_{A_{DI}} \\ \sigma_{A_{DI}} & \sigma^2_{A_I} \end{bmatrix}\right).$$

The $\sigma^2_{A_D}$ and $\sigma^2_{A_I}$ were set to 1.00, and $\sigma_{A_{DI}}$ was set to -0.50, 0.00 or
0.50. Each sire was randomly mated to a single dam,
resulting in 12 offspring per mating for a total of 6000
simulated offspring. For each offspring, direct and indirect
breeding values were obtained as
$A_D = \frac{1}{2}A_{D,sire} + \frac{1}{2}A_{D,dam} + MS_D$ and
$A_I = \frac{1}{2}A_{I,sire} + \frac{1}{2}A_{I,dam} + MS_I$, where the direct and
indirect Mendelian sampling terms were drawn from

$$N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \frac{1}{2}\begin{bmatrix} \sigma^2_{A_D} & \sigma_{A_{DI}} \\ \sigma_{A_{DI}} & \sigma^2_{A_I} \end{bmatrix}\right).$$

Each offspring was also assigned a direct and indirect
environmental value, drawn from

$$N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma^2_{E_D} & \sigma_{E_{DI}} \\ \sigma_{E_{DI}} & \sigma^2_{E_I} \end{bmatrix}\right).$$

The $\sigma^2_{E_D}$ and $\sigma^2_{E_I}$ were set
to 2.00, and $\sigma_{E_{DI}}$ was set to -1.00, 0.00 or 1.00. Animals
were placed in groups of four. Depending on the scenario,
group members belonged to one, two or four families.
Individual phenotypes were obtained by summing the direct
and indirect genetic and environmental components
according to Equation (1). Pooled records were obtained by
summing individual phenotypes according to Equation (3).
Seven scenarios were simulated, which differed in $\sigma_{A_{DI}}$, $\sigma_{E_{DI}}$
or group composition (Table 3). For each scenario, 100
replicates were produced.
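The simulation steps can be sketched as follows. This is a minimal Python illustration rather than the original R code: all function and variable names are assumptions, only the one-family group composition is covered, and the Mendelian sampling (co)variances are taken as half the base-population values, which assumes non-inbred parents.

```python
import math
import random


def bivariate_normal(rng, var_1, var_2, cov_12):
    """Draw one (x1, x2) pair with the given (co)variances via a 2x2 Cholesky factor."""
    l11 = math.sqrt(var_1)
    l21 = cov_12 / l11
    l22 = math.sqrt(var_2 - l21 ** 2)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return l11 * z1, l21 * z1 + l22 * z2


def simulate_one_family_design(n_families=500, family_size=12, group_size=4,
                               var_ad=1.0, var_ai=1.0, cov_adi=0.0,
                               var_ed=2.0, var_ei=2.0, cov_edi=0.0, seed=1):
    """Return (groups, pooled_records) for groups made up of full sibs only."""
    rng = random.Random(seed)
    groups, pooled_records = [], []
    for family in range(n_families):
        sire = bivariate_normal(rng, var_ad, var_ai, cov_adi)
        dam = bivariate_normal(rng, var_ad, var_ai, cov_adi)
        offspring = []
        for _ in range(family_size):
            # Mendelian sampling: half the base (co)variances (assumption, see lead-in)
            ms_d, ms_i = bivariate_normal(rng, var_ad / 2, var_ai / 2, cov_adi / 2)
            a_d = 0.5 * sire[0] + 0.5 * dam[0] + ms_d
            a_i = 0.5 * sire[1] + 0.5 * dam[1] + ms_i
            e_d, e_i = bivariate_normal(rng, var_ed, var_ei, cov_edi)
            offspring.append((family, a_d, a_i, e_d, e_i))
        for start in range(0, family_size, group_size):
            group = offspring[start:start + group_size]
            # Equation (1): own direct effects plus indirect effects of the group mates
            phenotypes = [
                a_d + e_d + sum(m[2] + m[4] for k, m in enumerate(group) if k != i)
                for i, (_, a_d, _, e_d, _) in enumerate(group)
            ]
            groups.append(group)
            pooled_records.append(sum(phenotypes))  # Equation (3)
    return groups, pooled_records


groups, pooled = simulate_one_family_design()
print(len(groups), len(pooled))  # 1500 1500
```

The other group compositions would only change how offspring are allocated to groups; the phenotype and pooling steps stay the same.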

Table 3 Scenarios used to simulate data

| | Scenario | $\sigma_{A_{DI}}$ | $\sigma_{E_{DI}}$ | Group composition |
|---|---|---|---|---|
| Reference scenario | 1 | 0.00 | 0.00 | Four families |
| Different $\sigma_{A_{DI}}$ | 2 | -0.50 | 0.00 | Four families |
| | 3 | 0.50 | 0.00 | Four families |
| Different $\sigma_{E_{DI}}$ | 4 | 0.00 | -1.00 | Four families |
| | 5 | 0.00 | 1.00 | Four families |
| Different group compositions | 6 | 0.00 | 0.00 | Two families |
| | 7 | 0.00 | 0.00 | One family |

$\sigma^2_{A_D}$ and $\sigma^2_{A_I}$ were set to 1.00; $\sigma^2_{E_D}$ and $\sigma^2_{E_I}$ were set to 2.00.

Based on the previous section, expectations are that
the use of a direct-indirect animal model for pooled
data will fail to differentiate between direct and indirect
genetic effects, while the use of a traditional animal
model for pooled data will yield estimates of $\sigma^2_{A_T}$. To
validate these theoretical predictions, both models were
run. First, the simulated pooled records were analysed
with the following direct-indirect animal model in
ASReml v3.0 [20]:



$$\mathbf{y}^* = \boldsymbol{\mu}^* + \mathbf{Z}^*_D\mathbf{a}_D + \mathbf{Z}^*_I\mathbf{a}_I + \mathbf{e}^*, \qquad (8)$$

where $\mathbf{y}^*$ is a vector that contains pooled records ($P^*$); $\boldsymbol{\mu}^*$
is a vector that contains the pooled mean; $\mathbf{Z}^*_D$ is an
incidence matrix linking the pooled records to $A_D$'s (each
pooled record was linked to the $A_D$'s of the four group
members); $\mathbf{a}_D$ is a vector that contains $A_D$'s; $\mathbf{Z}^*_I$ is an
incidence matrix linking the pooled records to $A_I$'s (each
pooled record was linked to the $A_I$'s of the four group
members); $\mathbf{a}_I$ is a vector that contains $A_I$'s; and $\mathbf{e}^*$ is a
vector that contains residuals. Second, the simulated
pooled records were analysed with the following traditional
animal model in ASReml v3.0 [20]:

$$\mathbf{y}^* = \boldsymbol{\mu}^* + \mathbf{Z}^*\mathbf{a} + \mathbf{e}^*, \qquad (9)$$

where $\mathbf{y}^*$, $\boldsymbol{\mu}^*$ and $\mathbf{e}^*$ are as explained above; $\mathbf{Z}^*$ is an
incidence matrix linking the pooled records to $A$'s (each
pooled record was linked to the $A$'s of the four group
members); and $\mathbf{a}$ is a vector that contains $A$'s.

Based on the previous section, expectations are that
the most accurate estimate of $\sigma^2_{A_T}$ will be obtained
when group members belong to the same family. To
validate this theoretical prediction, the predicted s.e. of $\hat{\sigma}^2_{A_T}$
was compared to (i) the standard deviation (s.d.) of 100
estimates of $\sigma^2_{A_T}$ ($\hat{\sigma}^2_{A_T}$'s reported by ASReml) and (ii) the
mean of 100 s.e.'s of $\hat{\sigma}^2_{A_T}$ (s.e.'s reported by ASReml) for
three group compositions (scenarios 1, 6 and 7 of Table 3).

Data analyses 

The dataset was part of the pre-existing database of 
Hendrix Genetics (The Netherlands) and contained rou- 
tinely collected data for breeding value estimation. Ani- 
mal Care and Use Committee approval was therefore 
not required. 

To validate the theoretical derivations and to gain
insight into response to selection, individual and pooled
data on survival in purebred laying hens (Gallus gallus)
were analysed. Survival in group-housed laying hens is a
well-known example of a trait affected by social interactions,
since a bird's chance to survive depends on the
feather pecking and cannibalistic behaviour of its group
mates. Ellen et al. [5] used individual survival data on
three purebred lines to estimate direct and indirect genetic
(co)variances. Large and statistically significant indirect
genetic effects were found in two out of three
purebred lines. In the current study, we used data from
the same two lines. Data were provided by the "Institut
de Selection Animale B.V.", the layer breeding division
of Hendrix Genetics. Data on 13 192 White Leghorn
layers were provided of which 6276 were of line W1 and
6916 were of line WB.

At the age of 17 weeks, the hens were placed in two 
laying houses. The laying houses consisted of four or five 
double rows, and each row consisted of three levels. 
Interaction with neighbours on the back of the cage was 
possible, but interaction with neighbours on the side 
was prevented. Four hens of the same purebred line 
were randomly assigned to each cage. Hens were not 
beak-trimmed. Further details on housing conditions 
and management are in Ellen et al. [5]. 

The individual phenotype was defined as the number 
of days from the start of the laying period until either 
death or the end of the experiment, with a maximum of 
398 days. The individual phenotypes were summed per 
cage to obtain pooled records. If one individual phenotype
was missing, the entire cage was omitted from the
analysis. The final dataset contained records on 6092
W1 and 6852 WB hens.

To obtain the direct, indirect and total genetic parameters
for survival time, the individual phenotypes were
analysed with the following direct-indirect animal
model in ASReml v3.0 [20]:

$$\mathbf{y} = \mathbf{Xb} + \mathbf{Z}_D\mathbf{a}_D + \mathbf{Z}_I\mathbf{a}_I + \mathbf{V}\,\mathbf{cage} + \mathbf{e}, \qquad (10)$$

where $\mathbf{y}$ is a vector that contains individual phenotypes;
$\mathbf{X}$ is an incidence matrix linking the individual phenotypes
to fixed effects; $\mathbf{b}$ is a vector that contains fixed effects,
which included an interaction term for each laying
house by row by level combination, an effect for the
content of the back cage (full/empty) and a covariate for
the average number of survival days in the back cage; $\mathbf{Z}_D$
is an incidence matrix linking the individual phenotypes
to $A_D$'s; $\mathbf{a}_D$ is a vector that contains $A_D$'s; $\mathbf{Z}_I$ is an
incidence matrix linking the individual phenotypes to $A_I$'s; $\mathbf{a}_I$
is a vector that contains $A_I$'s; $\mathbf{V}$ is an incidence matrix
linking the individual phenotypes to random cage effects;
$\mathbf{cage}$ is a vector that contains random cage effects
(to account for the non-genetic covariance among phenotypes
of cage members [21]); and $\mathbf{e}$ is a vector that
contains residuals. This model yields estimates of $\sigma^2_{A_D}$,
$\sigma_{A_{DI}}$ and $\sigma^2_{A_I}$, from which $\sigma^2_{A_T}$ can be calculated.
Similarly, it yields estimates of $A_D$'s and $A_I$'s, from which $A_T$'s
can be calculated. To improve a trait, animals should be
selected based on their $A_T$, since $\sigma^2_{A_T}$ determines a
population's potential to respond to selection.

Alternatively, a traditional animal model can be used
to analyse individual or pooled data. A traditional animal
model on individual data only yields estimates of $\sigma^2_{A_D}$
and $A_D$'s. A traditional model on pooled data is expected
to yield estimates of $\sigma^2_{A_T}$ and $A_T$'s, but not of $\sigma^2_{A_D}$ and
$A_D$'s. To validate this theoretical prediction, these traditional
models were also run. First, the individual phenotypes
were analysed with the following traditional
(direct) animal model in ASReml v3.0 [20]:

$$\mathbf{y} = \mathbf{Xb} + \mathbf{Z}_D\mathbf{a}_D + \mathbf{V}\,\mathbf{cage} + \mathbf{e}, \qquad (11)$$

where $\mathbf{y}$, $\mathbf{X}$, $\mathbf{b}$, $\mathbf{Z}_D$, $\mathbf{V}$, $\mathbf{cage}$ and $\mathbf{e}$ are as explained
above. Second, the pooled records were analysed with
the following traditional animal model in ASReml v3.0
[20]:

$$\mathbf{y}^* = \mathbf{X}^*\mathbf{b}^* + \mathbf{Z}^*\mathbf{a} + \mathbf{e}^*, \qquad (12)$$

where $\mathbf{y}^*$ is a vector that contains pooled records ($P^*$); $\mathbf{X}^*$
is an incidence matrix linking the pooled records to
fixed effects; $\mathbf{b}^*$ is a vector that contains fixed effects (the
same fixed effects as mentioned above); $\mathbf{Z}^*$ is an incidence
matrix linking the pooled records to $A$'s (each
pooled record was linked to the $A$'s of the four group
members); $\mathbf{a}$ is a vector that contains $A$'s; and $\mathbf{e}^*$ is a
vector that contains residuals.

The estimated variance components and breeding
values of all three models were compared. In addition,
we calculated the loss in response to selection that
would occur when applying a traditional model to individual
or pooled data instead of a direct-indirect model
to individual data. The direct-indirect model applied to
individual data yielded estimates of $\sigma^2_{A_T}$ and $A_T$'s. Based
on their $A_T$, 250 animals were selected and the corresponding
response to selection was calculated. Similarly,
for the two traditional animal models, 250 animals were
selected based on their $A_D$ (obtained from individual
data) and $A$ (obtained from pooled data). Once the top
250 animals were selected, their $A_T$ (obtained from individual
data) was used to calculate the total response to
selection. Then, the loss in total response to selection
was calculated.
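The selection comparison can be sketched as below. The text does not spell out how response was computed, so this sketch makes a labelled assumption: response is taken as the mean $A_T$ of the selected animals, and loss is expressed relative to selection on $A_T$ itself. Function names and the toy breeding values are hypothetical.

```python
def total_response(selection_criterion, total_bv, n_selected):
    """Mean total breeding value of the n_selected animals ranked highest on the criterion."""
    ranked = sorted(range(len(total_bv)),
                    key=lambda i: selection_criterion[i], reverse=True)
    chosen = ranked[:n_selected]
    return sum(total_bv[i] for i in chosen) / n_selected


def loss_in_response(alternative_criterion, total_bv, n_selected):
    """Percentage loss relative to selecting directly on the total breeding value."""
    best = total_response(total_bv, total_bv, n_selected)
    alternative = total_response(alternative_criterion, total_bv, n_selected)
    return 100.0 * (best - alternative) / best


# Toy example with six animals and two selected (values are made up)
a_t = [5.0, 4.0, 3.0, 2.0, 1.0, 0.0]           # A_T from the direct-indirect model
a_pooled = [4.8, 4.1, 2.9, 2.2, 0.9, 0.1]      # A from pooled data, same ranking
a_direct = [1.0, 5.0, 0.0, 4.0, 2.0, 3.0]      # A_D from individual data, different ranking

print(loss_in_response(a_pooled, a_t, 2))
print(loss_in_response(a_direct, a_t, 2))
```

In the toy data the pooled criterion ranks animals exactly as $A_T$ does and therefore loses nothing, whereas the direct criterion picks animals with lower $A_T$.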

Results and discussion 

Simulation 

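The direct-indirect animal model on pooled records
failed to converge, confirming that direct and indirect
(co)variances cannot be estimated from pooled data. The
traditional animal model on pooled records yielded estimates
of $\sigma^2_{A_T}$ and $\sigma^2_{E^*}$. These estimates did not differ
significantly from the true $\sigma^2_{A_T}$ and $\sigma^2_{E^*}$ (Table 4), where

$$\sigma^2_{A_T} = \sigma^2_{A_D} + 2(n-1)\sigma_{A_{DI}} + (n-1)^2\sigma^2_{A_I} \qquad (13)$$

(derived by [14]) and

$$\sigma^2_{E^*} = n\left[\sigma^2_{E_D} + 2(n-1)\sigma_{E_{DI}} + (n-1)^2\sigma^2_{E_I}\right] \qquad (14)$$

(analogous to [17]).

Table 4 True and estimated $\sigma^2_{A_T}$ and $\sigma^2_{E^*}$ for five scenarios

| Scenario (1) | $\sigma_{A_{DI}}$ | $\sigma_{E_{DI}}$ | True $\sigma^2_{A_T}$ (2) | $\hat{\sigma}^2_{A_T}$ ± s.e. | True $\sigma^2_{E^*}$ (3) | $\hat{\sigma}^2_{E^*}$ ± s.e. |
|---|---|---|---|---|---|---|
| 1 | 0.00 | 0.00 | 10.00 | 10.10 ± 1.85 | 80.00 | 80.56 ± 6.69 |
| 2 | -0.50 | 0.00 | 7.00 | 7.43 ± 1.59 | 80.00 | 79.29 ± 6.08 |
| 3 | 0.50 | 0.00 | 13.00 | 13.05 ± 2.12 | 80.00 | 80.32 ± 7.30 |
| 4 | 0.00 | -1.00 | 10.00 | 9.70 ± 1.54 | 56.00 | 56.54 ± 5.24 |
| 5 | 0.00 | 1.00 | 10.00 | 9.81 ± 2.10 | 104.00 | 104.71 ± 8.03 |

(1) $\sigma^2_{A_D}$ and $\sigma^2_{A_I}$ were set to 1.00; $\sigma^2_{E_D}$ and $\sigma^2_{E_I}$ were set to 2.00; group members belonged to four different families.
(2) $\sigma^2_{A_T} = \sigma^2_{A_D} + 2(n-1)\sigma_{A_{DI}} + (n-1)^2\sigma^2_{A_I}$.
(3) $\sigma^2_{E^*} = n\left[\sigma^2_{E_D} + 2(n-1)\sigma_{E_{DI}} + (n-1)^2\sigma^2_{E_I}\right]$.

Based on Equation (7), the s.e. of $\hat{\sigma}^2_{A_T}$ was predicted
for three scenarios that differed in group composition,
i.e. group members belonged to one, two or four families.
The theoretical s.e. of $\hat{\sigma}^2_{A_T}$ was compared to (i) the
s.d. of 100 estimates of $\sigma^2_{A_T}$ ($\hat{\sigma}^2_{A_T}$'s reported by ASReml)
and (ii) the mean of 100 s.e.'s of $\hat{\sigma}^2_{A_T}$ (s.e.'s reported by
ASReml) (Table 5). The theoretical s.e. of $\hat{\sigma}^2_{A_T}$ did not
differ significantly from the values obtained by simulation.
Moreover, as predicted, the most accurate estimate
of $\sigma^2_{A_T}$ was obtained when group members belonged to
the same family. In comparison, the s.e. of $\hat{\sigma}^2_{A_T}$ was twice
as large when group members belonged to different families.
This indicates that group composition is crucial
when aiming to obtain accurate estimates.

Table 5 Theoretically predicted s.e.($\hat{\sigma}^2_{A_T}$), s.d.($\hat{\sigma}^2_{A_T}$) and s.e.($\hat{\sigma}^2_{A_T}$) for three group compositions

| Group composition | Scenario (3) | Predicted s.e.($\hat{\sigma}^2_{A_T}$) | s.d.($\hat{\sigma}^2_{A_T}$) ± s.d. (1) | s.e.($\hat{\sigma}^2_{A_T}$) ± s.d. (2) |
|---|---|---|---|---|
| Four families | 1 | 1.88 | 2.01 ± 0.14 | 1.85 ± 0.13 |
| Two families | 6 | 1.30 | 1.23 ± 0.09 | 1.23 ± 0.08 |
| One family | 7 | 0.92 | 0.81 ± 0.06 | 0.92 ± 0.05 |

(1) s.d.($\hat{\sigma}^2_{A_T}$) based on 100 $\hat{\sigma}^2_{A_T}$'s reported by ASReml.
(2) s.e.($\hat{\sigma}^2_{A_T}$) based on 100 s.e.'s reported by ASReml.
(3) $\sigma^2_{A_D}$ and $\sigma^2_{A_I}$ were set to 1.00; $\sigma_{A_{DI}}$ was set to 0.00; $\sigma^2_{E_D}$ and $\sigma^2_{E_I}$ were set to 2.00; $\sigma_{E_{DI}}$ was set to 0.00.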

Data analyses 

Table 6 shows the estimated variance components for
individual survival data analysed with a direct-indirect
animal model, and the estimated variance components
for individual and pooled survival data analysed with a
traditional animal model. The direct-indirect animal
model on individual data yielded estimates of $\sigma^2_{A_D}$, $\sigma_{A_{DI}}$
and $\sigma^2_{A_I}$. Based on these components, $\sigma^2_{A_T}$ was calculated
(according to Equation (13)). The traditional animal
model on individual data yielded estimates of $\sigma^2_{A_D}$. The
traditional animal model on pooled data yielded estimates
of $\sigma^2_{A_T}$ that closely resembled the estimates of $\sigma^2_{A_T}$
from individual data. The direct-indirect animal model
on individual data also yielded estimates of $\sigma^2_{Cage}$ and $\sigma^2_E$.
As derived by Bergsma et al. [21], $\hat{\sigma}^2_{Cage}$ is an estimate of
$2\sigma_{E_{DI}} + (n-2)\sigma^2_{E_I}$. As derived by Bijma [22], $\hat{\sigma}^2_E$ is an
estimate of $\sigma^2_{E_D} - 2\sigma_{E_{DI}} + \sigma^2_{E_I}$. As shown in Equation (14), $\hat{\sigma}^2_{E^*}$ is
an estimate of $n\left[\sigma^2_{E_D} + 2(n-1)\sigma_{E_{DI}} + (n-1)^2\sigma^2_{E_I}\right]$.
Consequently, the $\hat{\sigma}^2_{Cage}$ and $\hat{\sigma}^2_E$ from the direct-indirect animal
model on individual data should sum to the $\hat{\sigma}^2_{E^*}$ from the
traditional animal model on pooled data. More precisely:

$$\hat{\sigma}^2_{E^*} = n^2\hat{\sigma}^2_{Cage} + n\hat{\sigma}^2_E. \qquad (15)$$

The expected $\hat{\sigma}^2_{E^*}$, calculated based on the $\hat{\sigma}^2_{Cage}$ and
$\hat{\sigma}^2_E$ from the direct-indirect animal model on individual
data, and the $\hat{\sigma}^2_{E^*}$ from the traditional animal model on
pooled data closely resembled each other.
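The arithmetic behind Equations (13) and (15) can be reproduced from the rounded estimates in Table 6. The sketch below is only a check on those published numbers; the function names and the dictionary layout are assumptions, and small differences from the table (for example 44 704 versus 44 700) presumably arise because the table entries were computed from unrounded estimates.

```python
def total_genetic_variance(var_ad, cov_adi, var_ai, n):
    """Equation (13): sigma2_AT from the direct-indirect (co)variances."""
    return var_ad + 2 * (n - 1) * cov_adi + (n - 1) ** 2 * var_ai


def expected_pooled_error_variance(var_cage, var_e, n):
    """Equation (15): expected pooled error variance from cage and error variances."""
    return n ** 2 * var_cage + n * var_e


# Rounded estimates from the direct-indirect model on individual data (Table 6)
estimates = {
    "W1": {"var_ad": 705, "cov_adi": 59, "var_ai": 104, "var_cage": 799, "var_e": 7980},
    "WB": {"var_ad": 1404, "cov_adi": -162, "var_ai": 232, "var_cage": 1191, "var_e": 12675},
}

results = {}
for line, est in estimates.items():
    var_at = total_genetic_variance(est["var_ad"], est["cov_adi"], est["var_ai"], n=4)
    var_e_pooled = expected_pooled_error_variance(est["var_cage"], est["var_e"], n=4)
    results[line] = (var_at, var_e_pooled)
    print(line, var_at, var_e_pooled)
# W1 1995 44704
# WB 2520 69756
```

These values sit within rounding of the total genetic variances (1996 and 2521) and expected pooled error variances (44 700 and 69 752) reported in Table 6.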

Table 6 does not show heritability estimates. Where the
classical heritability ($h^2$) is used to express $\sigma^2_{A_D}$ relative to
the phenotypic variance ($\sigma^2_P$), $T^2$ is used to express $\sigma^2_{A_T}$
relative to $\sigma^2_P$ [21]. Comparing values of $T^2$ obtained from
individual and pooled data would be misleading because
they are not expected to be similar. Unlike for a trait that
is not affected by social interactions, $\sigma^2_{P^*}$ cannot simply be
divided by the number of group members to obtain $\sigma^2_P$.
When group members are unrelated,

$$\sigma^2_P = \sigma^2_{A_D} + (n-1)\sigma^2_{A_I} + \sigma^2_{Cage} + \sigma^2_E \qquad (16)$$

and

$$\sigma^2_{P^*} = n\left[\sigma^2_{A_D} + 2(n-1)\sigma_{A_{DI}} + (n-1)^2\sigma^2_{A_I} + \sigma^2_{E_D} + 2(n-1)\sigma_{E_{DI}} + (n-1)^2\sigma^2_{E_I}\right]. \qquad (17)$$

The non-proportional increase of $\sigma^2_P$ does not enable a
meaningful comparison between values of $T^2$ obtained
from individual and pooled data.
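A rough numerical illustration of this non-proportionality, using the rounded W1 estimates from Table 6 and assuming unrelated group members, is sketched below. Equation (17) is rewritten here as $n\sigma^2_{A_T} + \sigma^2_{E^*}$, which follows from Equations (13) and (14); the function names are illustrative.

```python
def phenotypic_variance_individual(var_ad, var_ai, var_cage, var_e, n):
    """Equation (16): phenotypic variance of an individual record."""
    return var_ad + (n - 1) * var_ai + var_cage + var_e


def phenotypic_variance_pooled(var_at, var_e_pooled, n):
    """Equation (17), written as n * sigma2_AT + sigma2_E* via Equations (13) and (14)."""
    return n * var_at + var_e_pooled


n = 4
var_p = phenotypic_variance_individual(var_ad=705, var_ai=104, var_cage=799, var_e=7980, n=n)
var_p_pooled = phenotypic_variance_pooled(var_at=1996, var_e_pooled=44700, n=n)

print(var_p, var_p_pooled)
print(var_p_pooled / var_p > n)  # the pooled variance exceeds n times the individual variance
```

With these inputs the pooled phenotypic variance is more than five times the individual one, not four times, which is why ratios such as $T^2$ are not comparable across the two data types.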

In conclusion, when group members are unrelated,
a traditional animal model on individual data yields
estimates of $\sigma^2_{A_D}$, while a traditional animal model on
pooled data yields estimates of $\sigma^2_{A_T}$. Moreover, the estimated
cage and error variances from a direct-indirect
animal model on individual data sum to the pooled error
variance from a traditional animal model on pooled data
(Equation (15)).
This result could explain the 'inconsistencies' found by
Biscarini et al. [17], who assumed that a traditional animal
model on individual and pooled data should yield
the same genetic variance. Moreover, Biscarini et al. [17]
expected to find a pooled error variance that is four
times larger than the individual error variance. For body
weight at the age of 19 and 27 weeks, these expectations
were met. For body weight at the age of 43 and 51
weeks, however, the genetic variance estimated from
pooled data was smaller than expected, while the pooled
error variance was larger than expected. Biscarini et al.
[17] mention the emergence of competition effects as a
possible cause. We would indeed expect to find indirect genetic
effects if the individual data on body weight at the
age of 43 and 51 weeks were reanalysed with a direct-indirect
animal model. Using Equations (13) and (15),
the estimated variance components from individual data
would then resemble the estimated variance components
from pooled data.

Table 6 Estimated variance components (with s.e.) from
individual and pooled data on survival in laying hens

| | W1 | WB |
|---|---|---|
| Direct-indirect animal model on individual data | | |
| $\sigma^2_{A_D}$ | 705 (± 171) | 1404 (± 301) |
| $\sigma_{A_{DI}}$ | 59 (± 61) | -162 (± 105) |
| $\sigma^2_{A_I}$ | 104 (± 41) | 232 (± 72) |
| $\sigma^2_{Cage}$ | 799 (± 166) | 1191 (± 238) |
| $\sigma^2_E$ | 7980 (± 210) | 12 675 (± 365) |
| $\sigma^2_{A_T}$ (1) | 1996 (± 640) | 2521 (± 842) |
| Expected $\sigma^2_{E^*}$ (2) | 44 700 (± 2526) | 69 752 (± 3513) |
| Traditional (direct) animal model on individual data | | |
| $\sigma^2_{A_D}$ | 677 (± 165) | 1522 (± 317) |
| $\sigma^2_{Cage}$ | 1096 (± 127) | 1443 (± 186) |
| $\sigma^2_E$ | 8002 (± 205) | 13 008 (± 338) |
| Traditional animal model on pooled data | | |
| $\sigma^2_{A_T}$ | 1979 (± 643) | 2521 (± 845) |
| $\sigma^2_{E^*}$ | 44 750 (± 2538) | 69 750 (± 3519) |

(1) In groups of four, $\sigma^2_{A_T}$ equals $\sigma^2_{A_D} + 6\sigma_{A_{DI}} + 9\sigma^2_{A_I}$.
(2) In groups of four, $\sigma^2_{E^*}$ equals $16\sigma^2_{Cage} + 4\sigma^2_E$.

The regression coefficients of $A_D$'s obtained from
individual data on the $A$'s obtained from pooled
data strongly deviated from one (0.363 ± 0.006 for W1;
0.392 ± 0.010 for WB). The regression coefficients of $A_T$'s
obtained from individual data on the $A$'s obtained from
pooled data were close to, and not significantly different
from, one (1.004 ± 0.003 for W1; 1.001 ± 0.001 for WB).
This indicates that the $A$'s obtained from pooled data are
unbiased estimates of the $A_T$'s obtained from individual
data.

Table 7 shows Spearman correlation coefficients between
$A_D$'s and $A_T$'s obtained from individual data and
the $A$'s obtained from pooled data. The Spearman correlation
coefficients between the $A_T$'s obtained from individual
data and the $A$'s obtained from pooled data
were close to, but significantly different from, one. This
indicates only a minor loss in the accuracy of $A_T$'s when
using pooled instead of individual data, which will be
reflected in a minor loss in response to selection when
using pooled instead of individual data.

Table 7 Spearman correlation coefficients between $A_D$'s
and $A_T$'s obtained from individual data and $A$'s from
pooled data on survival in laying hens

| | $A_D$ | $A_T$ | $A$ |
|---|---|---|---|
| $A_D$ | | 0.513 (± 0.001) | 0.412 (± 0.001) |
| $A_T$ | 0.725 (± 0.001) | | 0.992 (± 0.001) |
| $A$ | 0.543 (± 0.001) | 0.967 (± 0.001) | |

Spearman correlation coefficients for data on W1 hens below the diagonal
and for data on WB hens above the diagonal.
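For readers who want to reproduce this kind of comparison on their own breeding values, a Spearman correlation can be computed as the Pearson correlation of ranks. The sketch below is a generic illustration, not the software used in the study; function names and the toy values are assumptions, and ties receive average ranks.

```python
import math


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy breeding values for five animals (made up for illustration)
a_t_individual = [1.0, 2.0, 3.0, 4.0, 5.0]
a_pooled = [2.0, 1.0, 4.0, 3.0, 5.0]
print(round(spearman(a_t_individual, a_pooled), 3))  # 0.8
```

Because only ranks enter the calculation, the coefficient reflects how similarly two sets of breeding values would order selection candidates.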

To gain more insight, we calculated the loss in re- 
sponse to selection that occurs when applying a trad- 
itional model to individual or pooled data instead of a 
direct-indirect model to individual data. When applying 
a traditional model to individual data, the loss in total
response to selection was 46.9% for W1 (Figure 1A) and
54.9% for WB (Figure 1C). When applying a traditional
model to pooled data, the loss in total response to selection
was 3.3% for W1 (Figure 1B) and 0.3% for WB
(Figure 1D). In conclusion, the loss in total response to
selection will be large when using a traditional animal 
model on individual data, but will be small when using 
a traditional animal model on pooled data. However, 
this outcome may be specific to this dataset. Survival in 
purebred laying hens was recorded in cages with four 
unrelated birds. Both direct and indirect genetic effects 
strongly influenced the trait. Group size, group compos- 
ition, and the relative impact of direct and indirect gen- 
etic effects might influence the loss in total response to 
selection. For example, for body weight at 19 and 27 
weeks of age, indirect genetic effects are expected to be 
small. In that case, an animal's $A_T$ is mainly expressed
in the phenotype of the animal itself. Consequently, we 
expect that more accurate estimated breeding values 
can be obtained when using individual instead of pooled 
data. Biscarini et al. [17] found a correlation of ~ 0.75 
between the estimated breeding values based on individ- 
ual and pooled data, resulting in a large loss in response 
to selection when using pooled instead of individual 
data. Thus, using pooled data does not always seem to 
be a proper alternative and requires further research. 

Conclusions 

Using pooled data, the total genetic variance and breed- 
ing values can be estimated, but the underlying direct 
and indirect genetic (co)variances and breeding values 
cannot. The most accurate estimates are obtained when 
group members belong to the same family. While quan- 
tifying the direct and indirect genetic effects is interest- 
ing from a biological perspective, obtaining the total 
genetic effect is most important from an animal breed- 
ing perspective. When it is too difficult or expensive to 
obtain individual data, pooled data can be used to im- 
prove traits. 

Figure 1 $A_T$'s obtained from individual data plotted against $A_D$'s obtained from individual data and $A$'s obtained from pooled data on
survival in laying hens. A and B for data on W1 hens. C and D for data on WB hens. $\Delta G_1$ represents the total response to selection when
selecting animals based on their $A_D$ obtained from individual data or $A$ obtained from pooled data. $\Delta G_2$ represents the total response to
selection when selecting animals based on their $A_T$ obtained from individual data.

Appendix A

This section demonstrates why direct and indirect
(co)variances can be estimated from individual data, but
cannot be estimated from pooled data.

Consider a situation where four base parents produce
six offspring. Animals are kept in groups of two and
individual phenotypes are recorded on all six offspring
(Table 8).

Table 8 Example pedigree structure and group composition (columns: Animal, Sire, Dam, Phenotype, Group)

When analysing individual data with a direct-indirect
animal model, the Z-matrices have one row per individual
record and one column per animal (the four base parents
followed by the six offspring). Each row of $\mathbf{Z}_D$ contains a
single 1 in the column of the animal that produced the
record, whereas each row of $\mathbf{Z}_I$ contains a single 1 in the
column of that animal's group mate.

$\mathbf{Z}_D$ and $\mathbf{Z}_I$ are not identical, indicating that the direct
and indirect genetic effects are estimated based on different
information sources, enabling the model to distinguish
between these two effects.

When analysing pooled data with a direct-indirect
animal model, the Z-matrices have one row per pooled
record and one column per animal. Each row of $\mathbf{Z}^*_D$
contains a 1 in the columns of both group members, and
each row of $\mathbf{Z}^*_I$ contains a 1 in exactly the same columns.

$\mathbf{Z}^*_D$ and $\mathbf{Z}^*_I$ are identical, indicating that the direct and
indirect genetic effects are estimated based on the same
information source, causing complete confounding between
direct and indirect genetic effects. The model
will not be able to distinguish between these two
effects.
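The contrast between the two sets of Z-matrices can be made explicit with a small sketch. The grouping used here (offspring 5 to 10 paired as 5-6, 7-8 and 9-10, with base parents numbered 1 to 4) is a hypothetical choice for illustration and is not taken from Table 8; the function name is likewise an assumption.

```python
def incidence_matrices(groups, n_animals, pooled):
    """Return (Z_D, Z_I) as lists of rows for individual or pooled records."""
    z_d, z_i = [], []
    for group in groups:
        if pooled:
            row_d = [0] * n_animals
            row_i = [0] * n_animals
            for animal in group:
                row_d[animal - 1] = 1               # own direct effect enters the pooled record
                row_i[animal - 1] = len(group) - 1  # indirect effect expressed in each group mate
            z_d.append(row_d)
            z_i.append(row_i)
        else:
            for animal in group:
                row_d = [0] * n_animals
                row_i = [0] * n_animals
                row_d[animal - 1] = 1               # record linked to the animal's own A_D
                for mate in group:
                    if mate != animal:
                        row_i[mate - 1] = 1         # record linked to the group mate's A_I
                z_d.append(row_d)
                z_i.append(row_i)
    return z_d, z_i


# Hypothetical layout: animals 1-4 are base parents, 5-10 are offspring in groups of two
groups = [(5, 6), (7, 8), (9, 10)]
zd_ind, zi_ind = incidence_matrices(groups, n_animals=10, pooled=False)
zd_pool, zi_pool = incidence_matrices(groups, n_animals=10, pooled=True)

print(zd_ind == zi_ind)    # False: different information for direct and indirect effects
print(zd_pool == zi_pool)  # True: complete confounding in groups of two
```

With larger groups the pooled indirect matrix in this sketch becomes a multiple of the direct one rather than an exact copy, which leaves the two effects just as inseparable.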

Appendix B 

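Components of variance are determined by analysis of
variance, where the full variance ($\sigma^2_f$) is partitioned into
a between- ($\sigma^2_b$) and within-family component ($\sigma^2_w$). In
this section, the derivations of $\sigma^2_f$, $\sigma^2_b$ and $\sigma^2_w$ are
presented for three group compositions. Throughout,
$\sigma^2_{P_D} = \sigma^2_{A_D} + \sigma^2_{E_D}$, $\sigma_{P_{DI}} = \sigma_{A_{DI}} + \sigma_{E_{DI}}$ and
$\sigma^2_{P_I} = \sigma^2_{A_I} + \sigma^2_{E_I}$.

(i) When the group is composed of only one family,
the $A_T$ of a family is expressed $n$ times in the
same pooled record. Therefore, the record of
interest is $P^*/n$,

$$\sigma^2_f = \frac{n\left[\sigma^2_{P_D} + 2(n-1)\sigma_{P_{DI}} + (n-1)^2\sigma^2_{P_I}\right] + n(n-1)\,r\,\sigma^2_{A_T}}{n^2} = \frac{1}{n}\left[\sigma^2_{P_D} + 2(n-1)\sigma_{P_{DI}} + (n-1)^2\sigma^2_{P_I} + (n-1)\,r\,\sigma^2_{A_T}\right],$$

$$\sigma^2_b = r\,\sigma^2_{A_T},$$

$$\sigma^2_w = \sigma^2_f - \sigma^2_b = \frac{1}{n}\left[\sigma^2_{P_D} + 2(n-1)\sigma_{P_{DI}} + (n-1)^2\sigma^2_{P_I} + (n-1)\,r\,\sigma^2_{A_T}\right] - r\,\sigma^2_{A_T}.$$

(ii) When the group is composed of two families, the
$A_T$ of a family is expressed $n/2$ times in the same
pooled record. Therefore, the record of interest
is $2P^*/n$,

$$\sigma^2_f = \frac{4}{n}\left[\sigma^2_{P_D} + 2(n-1)\sigma_{P_{DI}} + (n-1)^2\sigma^2_{P_I} + \left(\frac{n}{2}-1\right) r\,\sigma^2_{A_T}\right],$$

$$\sigma^2_b = r\,\sigma^2_{A_T},$$

$$\sigma^2_w = \sigma^2_f - \sigma^2_b = \frac{4}{n}\left[\sigma^2_{P_D} + 2(n-1)\sigma_{P_{DI}} + (n-1)^2\sigma^2_{P_I} + \left(\frac{n}{2}-1\right) r\,\sigma^2_{A_T}\right] - r\,\sigma^2_{A_T}.$$

(iii) When the group composition is random, the $A_T$ of
a family is only expressed once per pooled record.
Therefore, the record of interest is $P^*$,

$$\sigma^2_f = \sigma^2_{P^*} = n\left[\sigma^2_{P_D} + 2(n-1)\sigma_{P_{DI}} + (n-1)^2\sigma^2_{P_I}\right],$$

$$\sigma^2_b = r\,\sigma^2_{A_T},$$

$$\sigma^2_w = n\left[\sigma^2_{P_D} + 2(n-1)\sigma_{P_{DI}} + (n-1)^2\sigma^2_{P_I}\right] - r\,\sigma^2_{A_T}.$$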